7 research outputs found

    Atomic: an open-source software platform for multi-level corpus annotation

    This paper presents Atomic, an open-source platform-independent desktop application for multi-level corpus annotation. Atomic aims at providing the linguistic community with a user-friendly annotation tool and sustainable platform through its focus on extensibility, a generic data model, and compatibility with existing linguistic formats. It is implemented on top of the Eclipse Rich Client Platform, a pluggable Java-based framework for creating client applications. Atomic, as a set of plug-ins for this framework, integrates with the platform and allows other researchers to develop and integrate further extensions to the software as needed. The generic graph-based meta model Salt serves as Atomic’s domain model and allows for unlimited annotation levels and types. Salt is also used as an intermediate model in the Pepper framework for conversion of linguistic data, which is fully integrated into Atomic, making the latter compatible with a wide range of linguistic formats. Atomic provides tools for both less experienced and expert annotators: graphical, mouse-driven editors and a command-line data manipulation language for rapid annotation.

    Decomposing hierarchical alignment: Co-arguments as conditions on alignment and the limits of referential hierarchies as explanations in verb agreement

    Apart from common cases of differential argument marking, referential hierarchies affect argument marking in two ways: (a) through hierarchical marking, where markers compete for a slot and the competition is resolved by a hierarchy, and (b) through co-argument sensitivity, where the marking of one argument depends on the properties of its co-argument. Here we show that while co-argument sensitivity cannot be analyzed in terms of hierarchical marking, hierarchical marking can be analyzed in terms of co-argument sensitivity. Once hierarchical effects on marking are analyzed in terms of co-argument sensitivity, it becomes possible to examine alignment patterns relative to referential categories in exactly the same way as one can examine alignment patterns relative to referential categories in cases of differential argument marking and indeed any other condition on alignment (such as tense or clause type). As a result, instances of hierarchical marking of any kind turn out not to present a special case in the typology of alignment, and there is no need for positing an additional non-basic alignment type such as “hierarchical alignment”. While hierarchies are not needed for descriptive and comparative purposes, we also cast doubt on their relevance in diachrony: examining two families for which hierarchical agreement has been postulated, Algonquian and Kiranti, we find only weak and very limited statistical evidence for agreement paradigms to have been shaped by a principled ranking of person categories.

    Semantic role clustering: an empirical assessment of semantic role types in non-default case assignment

    This paper seeks to determine to what extent there is cross-linguistic evidence for postulating clusters of predicate-specific semantic roles such as experiencer, cognizer, possessor, etc. For this, we survey non-default case assignments in a sample of 141 languages and annotate the associated predicates for cross-linguistically recurrent semantic roles, such as ‘the one who feels cold’, ‘the one who eats sth.’, ‘the thing that is being eaten’. We then determine to what extent these roles are treated alike across languages, i.e. repeatedly grouped together under the same non-default case marker or under the same specific alternation with a non-default marker. Applying fuzzy cluster and NeighborNet algorithms to these data reveals cross-linguistic evidence for role clusters around experiencers, undergoers of body processes, and cognizers/perceivers in one- and two-place predicates; and around sources and transmitted speech in three-place predicates. No support emerges from non-default case assignment for any other role clusters that are traditionally assumed (e.g. for any distinctions among objects of two-argument predicates, or for distinctions between themes and instruments).

    Enriching TimeBank: Towards a more precise annotation of temporal relations in a text

    We propose a way of enriching the TimeML annotations of TimeBank by adding information about the Topic Time in terms of Klein (1994). The annotations are partly automatic, partly inferential and partly manual. The corpus was converted into the native format of the annotation software GraphAnno and POS-tagged using the Stanford bidirectional dependency network tagger. On top of each finite verb, a FIN-node with tense information was created, and on top of any FIN-node, a TOPICTIME-node, in accordance with Klein's (1994) treatment of finiteness as the linguistic correlate of the Topic Time. Each TOPICTIME-node is linked to a MAKEINSTANCE-node representing an (instantiated) event in TimeML (Pustejovsky et al. 2005), the markup language used for the annotation of TimeBank. For such links we introduce a new category, ELINK. ELINKs capture the relationship between the Topic Time (TT) and the Time of Situation (TSit) and have an aspectual interpretation in Klein's (1994) theory. In addition to these automatic and inferential annotations, some TLINKs were added manually. Using an example from the corpus, we show that the inclusion of the Topic Time in the annotations allows for a richer representation of the temporal structure than does TimeML. A way of representing this structure in a diagrammatic form similar to the T-Box format (Verhagen, 2007) is proposed.